Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case

Authors

  • Jocelyn Holden Bolin
  • W. Holmes Finch
Abstract

Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study, and classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of misclassified training data on several statistical classification and data mining techniques. Misclassification conditions in the three-group case are simulated, and results are presented in terms of overall as well as subgroup classification accuracy. Results show that classification accuracy decreases as sample size, group separation, and group size ratio decrease and as the misclassification percentage increases, with random forests demonstrating the highest accuracy across conditions.
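The simulation design described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or design: the nearest-centroid classifier is a simple stand-in for the methods compared in the study (it is not a random forest), and the two normally distributed features, the unit variances, the function names, and all parameter values are assumptions chosen for illustration. Each replication generates three groups, reassigns a fraction of the training cases to a wrong group, fits the classifier on the mislabeled data, and scores it against correctly labeled test data, overall and by group.

```python
import random

N_GROUPS = 3  # the three-group case


def simulate_groups(n_per_group, separation, rng):
    """Draw two normal features per case; group means are `separation` apart (assumed design)."""
    cases = []
    for group in range(N_GROUPS):
        mean = group * separation
        for _ in range(n_per_group):
            features = (rng.gauss(mean, 1.0), rng.gauss(mean, 1.0))
            cases.append((features, group))
    return cases


def mislabel(cases, fraction, rng):
    """Reassign a random `fraction` of cases to a different (wrong) group."""
    noisy = list(cases)
    n_flip = round(fraction * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        features, group = noisy[i]
        wrong = rng.choice([g for g in range(N_GROUPS) if g != group])
        noisy[i] = (features, wrong)
    return noisy


def fit_centroids(train):
    """Nearest-centroid stand-in classifier: one mean vector per observed label."""
    totals = {g: [0.0, 0.0, 0] for g in range(N_GROUPS)}
    for (x1, x2), group in train:
        totals[group][0] += x1
        totals[group][1] += x2
        totals[group][2] += 1
    return {g: (t[0] / t[2], t[1] / t[2]) for g, t in totals.items() if t[2] > 0}


def predict(centroids, features):
    """Assign a case to the group with the closest centroid."""
    def sq_dist(g):
        cx, cy = centroids[g]
        return (features[0] - cx) ** 2 + (features[1] - cy) ** 2
    return min(centroids, key=sq_dist)


def run_condition(n_per_group, separation, fraction, n_reps, seed=0):
    """Average overall and per-group accuracy over `n_reps` Monte Carlo replications."""
    rng = random.Random(seed)
    correct = [0] * N_GROUPS
    seen = [0] * N_GROUPS
    for _ in range(n_reps):
        train = mislabel(simulate_groups(n_per_group, separation, rng), fraction, rng)
        centroids = fit_centroids(train)
        # The test data keep their true labels, so accuracy is measured against the truth.
        for features, group in simulate_groups(n_per_group, separation, rng):
            seen[group] += 1
            correct[group] += predict(centroids, features) == group
    overall = sum(correct) / sum(seen)
    by_group = [c / s for c, s in zip(correct, seen)]
    return overall, by_group


if __name__ == "__main__":
    # Hypothetical conditions: 50 cases per group, separation 2.0, three mislabeling rates.
    for fraction in (0.0, 0.1, 0.3):
        overall, by_group = run_condition(50, 2.0, fraction, n_reps=100)
        print(fraction, round(overall, 3), [round(a, 3) for a in by_group])
```

Varying `n_per_group`, `separation`, and `fraction` in `run_condition` corresponds loosely to the sample size, group separation, and misclassification percentage factors named in the abstract. Unequal group size ratios are not modeled in this sketch.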

Similar resources

Kinetic Monte Carlo Study of Biodiesel Production through Transesterification of Brassica Carinata Oil

In the present study, the kinetics of biodiesel production through transesterification of Brassica carinata oil with methanol in the presence of potassium hydroxide is investigated by kinetic Monte Carlo simulation. The results obtained from the simulation agree qualitatively with the existing experimental data. The kinetics data for each step of the suggested mechanism are confirmed by simulation. By ...

Applying Point Estimation and Monte Carlo Simulation Methods in Solving Probabilistic Optimal Power Flow Considering Renewable Energy Uncertainties

The increasing penetration of renewable energy is changing traditional power system planning and operation tools. Because the power generated by renewable energy resources varies probabilistically, deterministic power system analysis tools cannot be applied in this case. Probabilistic optimal power flow is one of the most useful tools regarding the power system analysis in presen...

Population dynamic of Acipenser persicus by Monte Carlo simulation model and Bootstrap method in the southern Caspian Sea (Case study: Guilan province)

In this study, the population dynamics of Acipenser persicus were studied with an age-structured model using Monte Carlo and bootstrap approaches. Length frequency data from a total of 4376 specimens were collected from beach seine, fixed gill net and conservation force in coastal Guilan province from 2002 to 2012. Data were imported into FiSAT II for length frequency analysis with ELEFAN 1. K, L∞ and t0 were estimated at 203, 0.08 and...

A New Approach for Monte Carlo Simulation of RAFT Polymerization

In this work, based on experimental observations and exact theoretical predictions, the kinetic scheme of RAFT polymerization is extended to a wider range of reactions, such as irreversible intermediate radical terminations and reversible transfer reactions. The reactions included in the kinetic scheme are the more probable reactions from a theoretical point of view. The ...

Evaluation of glandular dose in mammography in the presence of breast cyst using Monte Carlo simulation

Introduction: Average glandular dose (AGD), entrance skin air kerma (ESAK) and normalized glandular dose (DgN) are the main dosimetric quantities in mammography. In this study, DgN is evaluated in the presence of breast cysts, a common condition among women, and the influence of the size, number and location of the cysts on DgN is investigated. Materials and Meth...

Journal title:

Volume 5, Issue

Pages -

Publication date: 2014